Abstract

We consider the problems of error-correcting codes and image restoration with
multiple stages of dynamics. Information extracted from the former stage can
be used selectively to improve the performance of the latter one. Analytic
results were derived for the mean-field systems using the cavity method. We find
that the multistage procedure has the advantage of being tolerant to uncertainties
in hyperparameter estimation, as confirmed by simulations.
I. INTRODUCTION



The corruption of signals by noise is a common problem encountered in information
processing. To retrieve signals from messages corrupted during transmission through
noisy channels, various error-correcting codes have been proposed [1]. In particular, the
error-correction mechanism of a class of parity-checking codes can be considered as the 
search for thermodynamically stable states of a Hamiltonian constructed in terms of the 
message bits 0. These codes have been demonstrated to saturate the Shannon information 
bound in the limit that each encoded bit checks the parity of an infinitely large number of 
message bits. While in practice each encoded bit can only check the parity of a finite
number of message bits, these codes still maintain a very low bit error probability.

The need to retrieve signals from corrupted messages is also inherent in image restoration 
Q. Although parity-checking bits may not be explicitly introduced for the task, prior 
knowledge about the images plays a similar role. For example, the smoothness of real- 
world images provides a mechanism for checking the pixel values in comparison with those 
of their neighbors. A corresponding Hamiltonian, consisting of a ferromagnetic bias to 
reflect the smoothening tendency, can be constructed in terms of the image pixels. Modern 
techniques of image restoration based on Markov random fields correspond to the search 
for thermodynamically stable states of the Hamiltonian system, using methods such as 
simulated annealing [4].

In a recent paper, we have shown that the problems of error-correcting codes and image
restoration can be formulated in a unified framework ||. In both tasks, the choice of the 
so-called hyperparameters is an important factor in determining their performances. Hyper- 
parameters refer to the coefficients of the various interactions appearing in the Hamiltonian 
of the tasks. In error correction, they determine the statistical significance given to the 
parity-checking terms and the received bits. Similarly in image restoration, they determine 
the statistical weights given to the prior knowledge and the received data. It was shown, 
by the use of inequalities, that the optimal choice of the hyperparameters corresponds to the
Maximum Posterior Marginal (MPM) method, where there is a match between the source
and model priors. This choice of values corresponds to the Nishimori point in the
space of hyperparameters 0. It is equivalent to a thermodynamic process at finite tem- 
perature, and the task performance is better than the Maximum A Posteriori probability 
(MAP) method, where the values of the hyperparameters are taken to infinity, equivalent 
to a zero temperature process. Furthermore, from the analytic solution of the infinite-range 
model and the Monte Carlo simulation of finite-dimensional models, it was shown that an 
inappropriate choice of the hyperparameters can lead to a rapid degradation of the task performance.

In fact, hyperparameter estimation has been the subject of many previous studies [7j|, a 
recently popular one using the "evidence framework" ||. However, if the prior models the 
source poorly, no hyperparameters can be reliable [[|. Even if they can be estimated ac- 
curately through steady-state statistical measurements, they may fluctuate when interfered 
by bursty noise sources in communication channels. Hence it is important to devise decod- 
ing or restoration procedures which are robust against the uncertainties in hyperparameter 
estimation. 

In this paper we propose the technique of selective freezing as a method to increase the 
tolerance to uncertainties in hyperparameter estimation. The technique has been studied for
pattern reconstruction in neural networks, where it led to an improvement in the retrieval
precision, a widening of the basin of attraction, and a boost in the storage capacity [10].
The idea is best illustrated for Ising bits or pixels with binary states ±1, though it can 
be easily generalized to other cases. In a finite temperature thermodynamic process, the 
Ising variables keep moving under thermal agitation. Some of them have smaller thermal 
fluctuations than the others, implying that they are more certain to stay in one state than 
the other. This stability implies that they have a higher probability to stay in the correct 
state for error-correction or image restoration tasks, even when the hyperparameters are 
not optimally tuned. It may thus be interesting to separate the thermodynamic process 
into two stages. In the first stage we select those relatively stable bits or pixels whose 
time-averaged states have a magnitude exceeding a certain threshold. In the second stage 
we subsequently fix (or freeze) them in the most probable thermodynamic states (for Ising 
variables this corresponds to the sign of the time-averaged state). Thus these selectively 
frozen bits or pixels are able to provide a more robust assistance to the less stable bits or 
pixels in their search for the most probable states. The selective freezing procedure reduces 
to the usual finite-temperature decoding or restoration process if all bits or pixels are frozen 
(since nothing happens in the second stage), or no bits or pixels are frozen (since the second 
stage is merely a continuation of the equilibration process of the first stage). 
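The selection step described above can be sketched in a few lines. This is only an illustration, assuming the first-stage time averages are already available as an array; the function name `select_and_freeze` and the sample values are hypothetical and not taken from the paper.

```python
import numpy as np

def select_and_freeze(time_avg, threshold):
    """Split spins into frozen and dynamic sets from first-stage time averages.

    time_avg  : time-averaged states in [-1, 1], one entry per bit or pixel
    threshold : freezing threshold in [0, 1]
    Returns (frozen_mask, frozen_values); frozen_values is 0 for dynamic spins.
    """
    time_avg = np.asarray(time_avg, dtype=float)
    frozen_mask = np.abs(time_avg) > threshold
    # Frozen spins are fixed to the sign of their time average
    frozen_values = np.where(frozen_mask, np.sign(time_avg), 0.0)
    return frozen_mask, frozen_values

avg = np.array([0.95, -0.10, -0.80, 0.30])   # illustrative first-stage averages
mask, values = select_and_freeze(avg, threshold=0.5)
```

With a threshold of 0 every spin with a nonzero average is frozen, and with a threshold of 1 none is, which correspond to the two limits in which the procedure reduces to the usual one-stage process.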

The two-stage thermodynamic process can be studied analytically in the mean-field 
model, which provides a qualitative guide to the behavior of more realistic cases of lower 
dimensions. However, it is necessary to give a remark about the theoretical approach. That 
is, as far as we have tried, the analytical solution has been inaccessible by the more conven- 
tional replica method. Rather, we have to use the cavity method to obtain the equations for 
the order parameters. In particular, the cavity method leads to the appearance of a term 
called the trans-susceptibility, which correctly describes the effects of the thermodynamics 
of the first stage on that of the second. 

The paper is organized as follows. In Section II we briefly review the formulation of error- 
correcting codes and image restoration in a unified framework. In Sections III and IV, we 
consider the mean-field model for error-correcting codes and image restoration respectively. 
We derive the equations for the order parameters of the two-stage thermodynamics using the 
cavity method, and present numerical results illustrating the robustness of selective freezing 
against uncertainties in hyperparameter estimation. We further demonstrate that even when 
the noise model changes without the receiver/restoration agent realizing the change (i.e. it
makes a wrong estimation of the prior), the task performance is still robust. For the more 
realistic cases of lower dimensions, simulation results illustrate the relevance of the infinite- 
range model in providing qualitative guidance. The conclusion is given in Section V. 



II. FORMULATION 

Consider an information source which generates data represented by a set of Ising spins
$\{\xi_i\}$, where $\xi_i = \pm 1$ and $i = 1, \cdots, N$. The data is generated according to the source
prior $P_s(\{\xi_i\})$. For error-correcting codes transmitting unbiased messages, all sequences are
equally probable and $P_s(\{\xi\}) = 2^{-N}$. For images with smooth structures, the prior consists
of ferromagnetic Boltzmann factors, which increase the tendencies of the neighboring spins
to stay at the same spin states, that is,

$$P_s(\{\xi\}) = \frac{1}{Z(\beta_s)}\exp\left[\frac{\beta_s}{z}\sum_{\langle ij\rangle}\xi_i\xi_j\right]. \tag{1}$$

Here $\langle ij\rangle$ represents pairs of neighboring spins, $z$ is the valency of each site, and the partition
function $Z(\beta_s)$ is given by

$$Z(\beta_s) = {\rm Tr}_{\xi}\exp\left[\frac{\beta_s}{z}\sum_{\langle ij\rangle}\xi_i\xi_j\right]. \tag{2}$$



The data is coded by constructing the codewords, which are the products of $p$ spins
$J^0_{i_1\cdots i_p} = \xi_{i_1}\cdots\xi_{i_p}$ for appropriately chosen sets of indices $\{i_1, \cdots, i_p\}$, the choice of which
determines the type of code. Each spin may appear in a number of p-spin codewords; the 
number of times of appearance is called the valency z p . The Sourlas code @ is equivalent to 
the infinite-range model in which all possible codewords of p spins are chosen from N spins. 
On the other hand, the Kabashima-Saad code || consists of combinations in which each 
spin appears in a random pre-selection of z p codewords. For conventional image restoration, 
codewords with only $p = 1$ are transmitted, corresponding to the pixels in the image; the
inclusion of terms with p > 1, and their positive effects on restoring the original image, 
have also been discussed in ||. For simplicity, we restrict ourselves to the case of a single 
nonvanishing value of $p$ with $p \geq 2$, and $p = 1$.
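As a small illustration of the codeword construction, the sketch below builds every $p$-spin product for a short message, which corresponds to the infinite-range choice in which all index sets are used. The function name `sourlas_codewords` and the four-bit message are hypothetical.

```python
from itertools import combinations
import numpy as np

def sourlas_codewords(xi, p):
    """Map each p-subset of indices to the product of the corresponding spins."""
    xi = np.asarray(xi)
    return {idx: int(np.prod(xi[list(idx)]))
            for idx in combinations(range(len(xi)), p)}

message = np.array([1, -1, -1, 1])            # illustrative message bits
codewords = sourlas_codewords(message, p=2)   # 6 pair products for N = 4

# Valency of spin 0: the number of codewords in which it appears
valency_0 = sum(0 in idx for idx in codewords)
```

A sparser code would keep only a pre-selected subset of these index sets, so that each spin appears in a fixed number of codewords.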

When the signal is transmitted through a noisy channel, the output consists of the sets
$\{J_{i_1\cdots i_p}\}$ and $\{\tau_i\}$, which are the corrupted versions of $\{J^0_{i_1\cdots i_p}\}$ and $\{\xi_i\}$ respectively. In the
binary symmetric channel, the outputs $J_{i_1\cdots i_p}$ are equal to $\mp\xi_{i_1}\cdots\xi_{i_p}$ with probabilities $p_J$ and
$1 - p_J$ respectively, and $\tau_i$ equal to $\mp\xi_i$ with probabilities $p_\tau$ and $1 - p_\tau$ respectively. Thus

$$P_{\rm out}(\{J\},\{\tau\}|\{\xi\}) \propto \exp\left(\beta_J\sum J_{i_1\cdots i_p}\xi_{i_1}\cdots\xi_{i_p} + \beta_\tau\sum_i\tau_i\xi_i\right), \tag{3}$$

where

$$\beta_J = \frac{1}{2}\ln\frac{1-p_J}{p_J} \quad {\rm and} \quad \beta_\tau = \frac{1}{2}\ln\frac{1-p_\tau}{p_\tau}. \tag{4}$$

The first summation in the exponent of Eq. (3) extends over an appropriate set of the
indices $(i_1, \cdots, i_p)$.

The Gaussian channel is defined by, for a given sequence $\{\xi\}$,

$$P_{\rm out}(\{J\},\{\tau\}|\{\xi\}) \propto \exp\left[-\frac{1}{2J^2}\sum\left(J_{i_1\cdots i_p} - J_0\,\xi_{i_1}\cdots\xi_{i_p}\right)^2 - \frac{1}{2\tau^2}\sum_i\left(\tau_i - a\xi_i\right)^2\right]. \tag{5}$$

$J_0$ and $a$ are the strengths of the signals to be fed into the channel, and $J^2$ and $\tau^2$ are the
variances of the noise. We note that by letting $\beta_J$ and $\beta_\tau$ be $J_0/J^2$ and $a/\tau^2$ respectively,
the input-dependent terms of Eq. (5) reduce to those of Eq. (3), which therefore can be
regarded as the noise model for both binary symmetric and Gaussian channels.
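The two ways of obtaining the channel parameters can be written down directly. The following is a minimal sketch; the function names `beta_bsc` and `beta_gaussian` are hypothetical, and the formulas are the ones stated for the binary symmetric channel in Eq. (4) and for the Gaussian channel in the paragraph above.

```python
import math

def beta_bsc(flip_prob):
    """Binary symmetric channel: beta = (1/2) ln((1 - p) / p), as in Eq. (4)."""
    if not 0.0 < flip_prob < 1.0:
        raise ValueError("flip probability must lie strictly between 0 and 1")
    return 0.5 * math.log((1.0 - flip_prob) / flip_prob)

def beta_gaussian(signal_strength, noise_variance):
    """Gaussian channel: beta = signal strength / noise variance."""
    return signal_strength / noise_variance

b_uninformative = beta_bsc(0.5)       # a channel that flips half the bits carries no weight
b_clean = beta_bsc(0.1)               # fewer flips give a larger weight
b_gauss = beta_gaussian(1.0, 2.0)     # e.g. J_0 = 1, J^2 = 2
```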

According to Bayesian statistics, the posterior probability that the source sequence is
$\{\sigma\}$, given the outputs $\{J\}$ and $\{\tau\}$, takes the form

$$P(\{\sigma\}|\{J\},\{\tau\}) \propto P_{\rm out}(\{J\},\{\tau\}|\{\sigma\})\,P_s(\{\sigma\}). \tag{6}$$

Using Eqs. (3) and (1), we have

$$P(\{\sigma\}|\{J\},\{\tau\}) \propto \exp\left(\beta_J\sum J_{i_1\cdots i_p}\sigma_{i_1}\cdots\sigma_{i_p} + \beta_\tau\sum_i\tau_i\sigma_i + \frac{\beta_s}{z}\sum_{\langle ij\rangle}\sigma_i\sigma_j\right). \tag{7}$$

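It often happens that the receiver at the end of the noisy channel does not have precise
information on $\beta_J$, $\beta_\tau$ or $\beta_s$. One then has to estimate these parameters. If the receiver
estimates $\beta$, $h$ and $\beta_m$ for $\beta_J$, $\beta_\tau$ and $\beta_s$ respectively, then the mean of the posterior distribution
of $\sigma_i$ is equal to the thermal average

$$\langle\sigma_i\rangle = \frac{{\rm Tr}\,\sigma_i\,e^{-H\{\sigma\}}}{{\rm Tr}\,e^{-H\{\sigma\}}}, \tag{8}$$

where the Hamiltonian is given by

$$H\{\sigma\} = -\beta\sum J_{i_1\cdots i_p}\sigma_{i_1}\cdots\sigma_{i_p} - h\sum_i\tau_i\sigma_i - \frac{\beta_m}{z}\sum_{\langle ij\rangle}\sigma_i\sigma_j. \tag{9}$$

One then regards ${\rm sgn}\langle\sigma_i\rangle$ as the $i$th bit of the decoded/restored information.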

To reduce the sensitivity of the decoding/restoration process to the uncertainties in
parameter estimation, we propose a two-stage process of selective freezing instead of the
one-stage thermodynamic process implied by Eq. (8). In the first stage the spins evolve
thermodynamically as prescribed in Eq. (8), and the thermal averages $\langle\sigma_i\rangle$ of the spins
are monitored. We may relate $\langle\sigma_i\rangle$ to an effective field $H_i$ by $\langle\sigma_i\rangle = \tanh H_i$. Spins with
larger magnitudes of $\langle\sigma_i\rangle$ correspond to larger magnitudes of $H_i$. They are more likely
to agree with the correct message or image bit, and are less likely to change signs even
when the hyperparameters vary. Their relative stability can be used to assist the less stable
spins to boost their robustness against hyperparameter uncertainties. Hence we select those
spins with $|\langle\sigma_i\rangle|$ exceeding a given threshold $\theta$, and freeze them in the second stage of the
thermodynamics. The average of the spin $\tilde\sigma_i$ in the second stage is then given by

$$\langle\tilde\sigma_i\rangle = \frac{{\rm Tr}\,\tilde\sigma_i\prod_j\left[\Theta\left(\langle\sigma_j\rangle^2 - \theta^2\right)\delta_{\tilde\sigma_j,\,{\rm sgn}\langle\sigma_j\rangle} + \Theta\left(\theta^2 - \langle\sigma_j\rangle^2\right)\right]e^{-\tilde H\{\tilde\sigma\}}}{{\rm Tr}\prod_j\left[\Theta\left(\langle\sigma_j\rangle^2 - \theta^2\right)\delta_{\tilde\sigma_j,\,{\rm sgn}\langle\sigma_j\rangle} + \Theta\left(\theta^2 - \langle\sigma_j\rangle^2\right)\right]e^{-\tilde H\{\tilde\sigma\}}}, \tag{10}$$

where $\Theta$ is the step function, $\tilde H\{\tilde\sigma\}$ is the Hamiltonian for the second stage, and has the same
form as Eq. (9) in the first stage. To increase the flexibility in the process, the parameters
$\beta$, $h$ and $\beta_m$ can be replaced by $\tilde\beta$, $\tilde h$ and $\tilde\beta_m$ respectively in the second stage. One then
regards ${\rm sgn}\langle\tilde\sigma_i\rangle$ as the $i$th spin of the decoding/restoration process.
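For a handful of spins, the one-stage thermal average and the two-stage average with frozen spins can be evaluated by brute-force enumeration. The sketch below assumes a Hamiltonian with pair couplings ($p = 2$) and field terms only, uses the same parameters in both stages, and takes hypothetical names (`thermal_averages`, `selective_freezing`) and random illustrative data; it is not the procedure used for the results in the paper.

```python
import itertools
import numpy as np

def thermal_averages(J, tau, beta, h, frozen=None):
    """Exact thermal averages; frozen[i] is +1 or -1 for a frozen spin, 0 if dynamic."""
    n = len(tau)
    frozen = np.zeros(n) if frozen is None else np.asarray(frozen, dtype=float)
    upper = np.triu(J, 1)                       # count each pair i < j once
    numer, denom = np.zeros(n), 0.0
    for state in itertools.product((-1.0, 1.0), repeat=n):
        s = np.array(state)
        if np.any((frozen != 0) & (s != frozen)):
            continue                            # configuration violates a frozen spin
        energy = -beta * (s @ upper @ s) - h * (tau @ s)
        weight = np.exp(-energy)
        numer += weight * s
        denom += weight
    return numer / denom

def selective_freezing(J, tau, beta, h, theta):
    """First-stage averages, then second-stage averages with stable spins frozen."""
    first = thermal_averages(J, tau, beta, h)
    frozen = np.where(first**2 > theta**2, np.sign(first), 0.0)
    second = thermal_averages(J, tau, beta, h, frozen)
    return first, second

rng = np.random.default_rng(0)
n = 6
J = rng.normal(size=(n, n)) / np.sqrt(n)
J = np.triu(J, 1) + np.triu(J, 1).T             # symmetric couplings, zero diagonal
tau = rng.normal(size=n)
first, second = selective_freezing(J, tau, beta=0.5, h=0.5, theta=0.5)
decoded = np.sign(second)
```

The enumeration cost grows as $2^N$, so this is useful only as a check on small systems.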

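The most important quantity in selective freezing is the overlap of the decoded/restored
bit ${\rm sgn}\langle\tilde\sigma_i\rangle$ and the original bit $\xi_i$ averaged over the output probability and the spin
distribution. This is given by

$$M_{\rm sf} = \sum_{\{\xi\}}\prod\int dJ\prod\int d\tau\,P_s(\{\xi\})\,P_{\rm out}(\{J\},\{\tau\}|\{\xi\})\,\xi_i\,{\rm sgn}\langle\tilde\sigma_i\rangle. \tag{11}$$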
Following Appendix A of 0, we can prove the following inequality 
$$M_{\rm sf} \leq M(\beta = \beta_J,\ h = \beta_\tau,\ \beta_m = \beta_s), \tag{12}$$

where the right hand side is the overlap of the single-stage dynamics when the model
parameters $\beta$, $h$ and $\beta_m$ match the source parameters $\beta_J$, $\beta_\tau$ and $\beta_s$ respectively. Hence selective
freezing cannot outperform the single-stage process if the hyperparameters can be estimated 
precisely. However, we remark that the purpose of selective freezing is rather to provide a 
relatively stable performance when the hyperparameters cannot be estimated precisely. This 
cannot be revealed from the inequality, but will be confirmed by the analytic and simulation 
results in Sections III and IV. 

III. THE INFINITE-RANGE MODEL FOR ERROR-CORRECTING CODES

Let us now suppose that the output of the transmission channel consists of only the set
of $p$-spin interactions $\{J_{i_1\cdots i_p}\}$. The Hamiltonian (9) then becomes

$$H\{\sigma\} = -\beta\sum_{i_1<\cdots<i_p}J_{i_1\cdots i_p}\sigma_{i_1}\cdots\sigma_{i_p}, \tag{13}$$

where we have set $\beta_m = 0$ for the case that all messages are equally probable.

Analytical solutions for the overlap are in general unavailable. We therefore consider the
infinite-range model in which the exchange interactions are present for all possible sets of
$p$ sites in the Hamiltonian of Eq. (13).

To consider the transition between error-free and errored regimes, we are interested in
the noise model in which $J_{i_1\cdots i_p}$ is Gaussian with mean $p!\,j_0\,\xi_{i_1}\cdots\xi_{i_p}/N^{p-1}$ and variance
$p!\,J^2/2N^{p-1}$. Since all messages are equally probable, we can apply a gauge transformation
$\sigma_i \to \sigma_i\xi_i$ and $J_{i_1\cdots i_p} \to J_{i_1\cdots i_p}\xi_{i_1}\cdots\xi_{i_p}$ to (13), and arrive at an equivalent $p$-spin model with
a ferromagnetic bias, where

$$P(J_{i_1\cdots i_p}) = \left(\frac{N^{p-1}}{\pi J^2 p!}\right)^{1/2}\exp\left[-\frac{N^{p-1}}{J^2 p!}\left(J_{i_1\cdots i_p} - \frac{p!}{N^{p-1}}\,j_0\right)^2\right]. \tag{14}$$

The Nishimori point for this model is located at $\beta = 2j_0/J^2$.
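For $p = 2$ the gauge-transformed couplings have mean $2j_0/N$ and variance $J^2/N$, which is easy to sample. The sketch below is an illustration under that specialization; the names `gauge_couplings` and `nishimori_beta` are hypothetical.

```python
import numpy as np

def gauge_couplings(n, j0, J, rng):
    """Symmetric p = 2 couplings with mean 2*j0/n and variance J**2/n, zero diagonal."""
    upper = rng.normal(2.0 * j0 / n, J / np.sqrt(n), size=(n, n))
    upper = np.triu(upper, 1)          # keep each pair i < j once
    return upper + upper.T

def nishimori_beta(j0, J):
    """Inverse temperature of the Nishimori point, beta = 2*j0 / J**2."""
    return 2.0 * j0 / J**2

rng = np.random.default_rng(0)
couplings = gauge_couplings(400, j0=1.0, J=1.0, rng=rng)
beta_n = nishimori_beta(1.0, 1.0)      # corresponds to T = 0.5 for j0 = J = 1
```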

The infinite-range model is exactly solvable using mean-field theoretical techniques for
disordered systems such as the replica or cavity method [11]. Here we use the cavity method
because of its more transparent physical interpretation, and because of some obstacles
encountered in the use of the replica method.

The cavity method uses a self-consistency argument to consider what happens when a 
spin is added or removed from the system. The central quantity in this method is the cavity 
field, which is the local field of a spin when it is added to the system, assuming that the 
exchange couplings act only one-way from the system to the new spin (but not from the spin 
back to the system) . Since the exchange couplings feeding the new spin have no correlations 
with the system, the cavity field becomes a Gaussian variable in the limit of large valency. 

A. Average spin in the first stage 



We start with the so-called "clustering property" for mean-field systems [11],

$$\langle\sigma_{i_1}\cdots\sigma_{i_p}\rangle = \langle\sigma_{i_1}\rangle\cdots\langle\sigma_{i_p}\rangle, \tag{15}$$

where $\langle\ \rangle$ represents thermodynamic averages. As shown in Appendix A, the clustering
property enables us to express the thermal averages of a spin in terms of the cavity field,
say, for spin 1,

$$\langle\sigma_1\rangle = \tanh\beta h_1; \quad h_1 = \sum_{j_1<\cdots<j_{p-1}}J_{1j_1\cdots j_{p-1}}\langle\sigma_{j_1}\rangle^{\backslash 1}\cdots\langle\sigma_{j_{p-1}}\rangle^{\backslash 1}, \tag{16}$$

where the superscript $\backslash 1$ denotes the thermal averages for a Hamiltonian in which $\sigma_1$ and
the associated exchange interactions are absent, but otherwise identical to Eq. (13). Thus
$h_1$ is the cavity field obeying a Gaussian distribution, whose mean and variance are $pj_0m^{p-1}$
and $pJ^2q^{p-1}/2$ respectively, where $m$ and $q$ are the magnetization and Edwards-Anderson
order parameter respectively, given by

$$m = \frac{1}{N}\sum_i\langle\sigma_i\rangle \quad {\rm and} \quad q = \frac{1}{N}\sum_i\langle\sigma_i\rangle^2. \tag{17}$$

It is convenient to write

$$\beta h_i = \hat m + \sqrt{\hat q}\,u_i, \tag{18}$$

where

$$\hat m = p\beta j_0m^{p-1} \quad {\rm and} \quad \hat q = \frac{p}{2}\beta^2J^2q^{p-1}, \tag{19}$$

and $u_i$ is a Gaussian variable with mean 0 and variance 1.

B. Order parameters in the first stage 

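Applying self-consistently the cavity argument to all terms in Eq. (17), we can obtain
self-consistent equations for $m$ and $q$:

$$m = \int Du\,\tanh G, \tag{20}$$

$$q = \int Du\,\tanh^2 G, \tag{21}$$

where $Du = du\,e^{-u^2/2}/\sqrt{2\pi}$ is the Gaussian measure, and $G = \hat m + \sqrt{\hat q}\,u$. The overlap for the
one-stage decoding process is given by

$$M = \frac{1}{N}\sum_i{\rm sgn}\langle\sigma_i\rangle = {\rm erf}\frac{\hat m}{\sqrt{2\hat q}}. \tag{22}$$

Now we consider selective freezing. If we introduce a freezing threshold $\theta$ so that all spins
with $\langle\sigma_i\rangle^2 > \theta^2$ are frozen, then the freezing fraction $f$ is given by

$$f = \frac{1}{N}\sum_i\Theta\left(\langle\sigma_i\rangle^2 - \theta^2\right) = 1 - \frac{1}{2}{\rm erf}\frac{u_+}{\sqrt 2} + \frac{1}{2}{\rm erf}\frac{u_-}{\sqrt 2}, \tag{23}$$

where $u_\pm = (\pm u_0 - \hat m)/\sqrt{\hat q}$ with $\tanh u_0 = \theta$.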
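The first-stage equations for $m$ and $q$ can be solved numerically by damped fixed-point iteration, with the Gaussian integrals done by Gauss-Hermite quadrature. The sketch below is only an illustration: the function names, the starting point, the damping factor and the number of quadrature nodes are all assumptions rather than choices taken from the paper.

```python
import math
import numpy as np

# Nodes and weights for the measure Du = du exp(-u^2/2) / sqrt(2*pi)
NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(80)
WEIGHTS = WEIGHTS / math.sqrt(2.0 * math.pi)

def scaled_fields(m, q, beta, j0, J, p):
    """Mean and variance of beta*h, following Eq. (19)."""
    m_hat = p * beta * j0 * m ** (p - 1)
    q_hat = 0.5 * p * beta**2 * J**2 * q ** (p - 1)
    return m_hat, q_hat

def first_stage(beta, j0=1.0, J=1.0, p=2, iters=500, damping=0.5):
    """Iterate Eqs. (20)-(21), starting in the ferromagnetic basin."""
    m, q = 0.9, 0.9
    for _ in range(iters):
        m_hat, q_hat = scaled_fields(m, q, beta, j0, J, p)
        t = np.tanh(m_hat + math.sqrt(q_hat) * NODES)
        m = damping * m + (1.0 - damping) * float(WEIGHTS @ t)
        q = damping * q + (1.0 - damping) * float(WEIGHTS @ t**2)
    return m, q

def one_stage_overlap(m, q, beta, j0=1.0, J=1.0, p=2):
    """Overlap of the one-stage process, Eq. (22)."""
    m_hat, q_hat = scaled_fields(m, q, beta, j0, J, p)
    return math.erf(m_hat / math.sqrt(2.0 * q_hat))

m, q = first_stage(beta=2.0)            # Nishimori point for p = 2, j0 = J = 1
M = one_stage_overlap(m, q, beta=2.0)
```

At the Nishimori point the solution should satisfy $m = q$ up to numerical error, which gives a convenient sanity check on the quadrature.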
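C. Average spin in the second stage

Assuming that the spin $\tilde\sigma_1$ is dynamic in the second stage, we can write

$$\tilde H\{\tilde\sigma\} \approx \tilde H\{\tilde\sigma\}^{\backslash 1} - \tilde\beta\sum_{j_1<\cdots<j_{p-1}}\tilde\sigma_1J_{1j_1\cdots j_{p-1}}\prod_{s=1}^{p-1}\left[\tilde\sigma_{j_s}\Theta\left(\theta^2 - \langle\sigma_{j_s}\rangle^2\right) + {\rm sgn}\langle\sigma_{j_s}\rangle\,\Theta\left(\langle\sigma_{j_s}\rangle^2 - \theta^2\right)\right], \tag{24}$$

where $\tilde H\{\tilde\sigma\}^{\backslash 1}$ is the Hamiltonian when spin 1 is completely removed from the system in both
stages of the thermodynamic process. Removing spin 1 may cause the thermal averages of
other spins to adjust slightly in the first stage. Hence some dynamic spins (with $\langle\sigma_k\rangle^2 < \theta^2$)
may become frozen ones (with $\langle\sigma_k\rangle^2 > \theta^2$) and vice versa, so that strictly speaking, further
terms should be considered in Eq. (24) to account for these secondary effects. For example,
if spin $k$ is induced to switch from dynamic to frozen (or vice versa) on removal of spin 1,
then the Taylor expansion of $\tilde H\{\tilde\sigma\}$ implies that an extra term

$$-\tilde\beta\left({\rm sgn}\langle\sigma_k\rangle^{\backslash 1} - \tilde\sigma_k\right)\left[\delta\left(\langle\sigma_k\rangle^{\backslash 1} - \theta\right) - \delta\left(\langle\sigma_k\rangle^{\backslash 1} + \theta\right)\right]\left(\langle\sigma_k\rangle - \langle\sigma_k\rangle^{\backslash 1}\right)$$
$$\times\sum_{i_1<\cdots<i_{p-1}\neq k}J_{ki_1\cdots i_{p-1}}\prod_{s=1}^{p-1}\left\{\tilde\sigma_{i_s}\Theta\left[\theta^2 - \left(\langle\sigma_{i_s}\rangle^{\backslash 1}\right)^2\right] + {\rm sgn}\langle\sigma_{i_s}\rangle^{\backslash 1}\,\Theta\left[\left(\langle\sigma_{i_s}\rangle^{\backslash 1}\right)^2 - \theta^2\right]\right\} \tag{25}$$

should be incorporated in Eq. (24). Here, we have neglected these terms for clarity.
Nevertheless, justification a posteriori can be provided for their deletion.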
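Using a cavity argument similar to Appendix A, we can show that

$$\langle\tilde\sigma_1\rangle = \tanh\tilde\beta\left(\sum_{j_1<\cdots<j_{p-1}}J_{1j_1\cdots j_{p-1}}\prod_{s=1}^{p-1}\left[\langle\tilde\sigma_{j_s}\rangle^{\backslash 1}\Theta\left(\theta^2 - \langle\sigma_{j_s}\rangle^2\right) + {\rm sgn}\langle\sigma_{j_s}\rangle\,\Theta\left(\langle\sigma_{j_s}\rangle^2 - \theta^2\right)\right]\right). \tag{26}$$

However, the effective field on the right hand side of Eq. (26) is still not a cavity field
because $\langle\sigma_{j_s}\rangle$, which is used in the step functions to decide whether the spin $j_s$ is dynamic
or frozen in the second stage, is different from $\langle\sigma_{j_s}\rangle^{\backslash 1}$. Hence it may have correlations with
spin 1. Taylor expansion of $\langle\sigma_{j_s}\rangle$ about $\langle\sigma_{j_s}\rangle^{\backslash 1}$ yields

$$\langle\tilde\sigma_1\rangle = \tanh\tilde\beta\Bigg\{\tilde h_1 + \sum_{j\neq j_1<\cdots<j_{p-2}}J_{1jj_1\cdots j_{p-2}}\prod_{s=1}^{p-2}\left\{\langle\tilde\sigma_{j_s}\rangle^{\backslash 1}\Theta\left[\theta^2 - \left(\langle\sigma_{j_s}\rangle^{\backslash 1}\right)^2\right] + {\rm sgn}\langle\sigma_{j_s}\rangle^{\backslash 1}\,\Theta\left[\left(\langle\sigma_{j_s}\rangle^{\backslash 1}\right)^2 - \theta^2\right]\right\}$$
$$\times\left[{\rm sgn}\langle\sigma_j\rangle^{\backslash 1} - \langle\tilde\sigma_j\rangle^{\backslash 1}\right]\left[\delta\left(\langle\sigma_j\rangle^{\backslash 1} - \theta\right) - \delta\left(\langle\sigma_j\rangle^{\backslash 1} + \theta\right)\right]\left(\langle\sigma_j\rangle - \langle\sigma_j\rangle^{\backslash 1}\right)\Bigg\}, \tag{27}$$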

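where $\tilde h_1$ is the generic cavity field which is now completely uncorrelated with spin 1. It is
given by

$$\tilde h_1 = \sum_{j_1<\cdots<j_{p-1}}J_{1j_1\cdots j_{p-1}}\prod_{s=1}^{p-1}\left\{\langle\tilde\sigma_{j_s}\rangle^{\backslash 1}\Theta\left[\theta^2 - \left(\langle\sigma_{j_s}\rangle^{\backslash 1}\right)^2\right] + {\rm sgn}\langle\sigma_{j_s}\rangle^{\backslash 1}\,\Theta\left[\left(\langle\sigma_{j_s}\rangle^{\backslash 1}\right)^2 - \theta^2\right]\right\}. \tag{28}$$

To evaluate the difference $\langle\sigma_j\rangle - \langle\sigma_j\rangle^{\backslash 1}$ appearing in Eq. (27), we have to apply the cavity
method a second time, by comparing the changes when both spins 1 and $j$ are removed.
This is done in Appendix B and the result is

$$\langle\sigma_j\rangle - \langle\sigma_j\rangle^{\backslash 1} = \left(\beta\,{\rm sech}^2\beta h_j^{\backslash 1}\right)\left(h_{j1}\tanh\beta h_1^{\backslash j}\right), \tag{29}$$

where

$$h_{1j} = h_{j1} = \sum_{1,j\neq k_1<\cdots<k_{p-2}}J_{1jk_1\cdots k_{p-2}}\langle\sigma_{k_1}\rangle^{\backslash 1j}\cdots\langle\sigma_{k_{p-2}}\rangle^{\backslash 1j}. \tag{30}$$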

When Eqs. (29)-(30) are substituted into Eq. (27), the significant contribution comes from
the terms which pair up $J_{1jj_1\cdots j_{p-2}}$ and $J_{1jk_1\cdots k_{p-2}}$. The various terms appearing in the
summation over $j \neq j_1 < \cdots < j_{p-2}$ involve thermal averages in the absence of spins 1
or $j$. We assume that the effects of removing a spin are negligible (which can be shown
to be equivalent to the replica symmetric approximation in the replica method [12]). Then
replacing the components of the terms by their mean values, and counting the $N^{p-2}/(p-2)!$
terms appearing in the summation over $j_1 < \cdots < j_{p-2}$, we arrive at

$$\langle\tilde\sigma_1\rangle = \tanh\tilde\beta\left\{\tilde h_1 + \frac{p}{2}(p-1)J^2\,\frac{1}{N}\sum_j\left[\delta\left(\langle\sigma_j\rangle - \theta\right) - \delta\left(\langle\sigma_j\rangle + \theta\right)\right]\left[{\rm sgn}\langle\sigma_j\rangle - \langle\tilde\sigma_j\rangle\right]\left(\beta\,{\rm sech}^2\beta h_j\right)\left(r^{p-2}\tanh\beta h_1\right)\right\}, \tag{31}$$

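where $r$ is the order parameter describing the spin correlations of the two thermodynamic
stages:

$$r = \frac{1}{N}\sum_i\langle\sigma_i\rangle\left[\langle\tilde\sigma_i\rangle\,\Theta\left(\theta^2 - \langle\sigma_i\rangle^2\right) + {\rm sgn}\langle\sigma_i\rangle\,\Theta\left(\langle\sigma_i\rangle^2 - \theta^2\right)\right]. \tag{32}$$

Eq. (31) can be simplified by introducing the trans-susceptibility $\chi_{\rm tr}$, which describes the
response of a spin in the second stage to variations of the cavity field in the first stage,
namely

$$\chi_{\rm tr} = \frac{1}{N}\sum_i\frac{\partial\langle\tilde\sigma_i\rangle}{\partial h_i}. \tag{33}$$

Since $\langle\tilde\sigma_i\rangle$ equals ${\rm sgn}\,h_i$ for $\tanh^2\beta h_i > \theta^2$, and $\tanh\tilde\beta\tilde h_i$ otherwise, we get

$$\chi_{\rm tr} = \frac{1}{N}\sum_i\left[\delta\left(\langle\sigma_i\rangle - \theta\right) - \delta\left(\langle\sigma_i\rangle + \theta\right)\right]\left[{\rm sgn}\langle\sigma_i\rangle - \langle\tilde\sigma_i\rangle\right]\beta\,{\rm sech}^2\beta h_i. \tag{34}$$

Eq. (31) can thus be simplified to

$$\langle\tilde\sigma_1\rangle = \tanh\tilde\beta\left[\tilde h_1 + \frac{p}{2}(p-1)J^2r^{p-2}\chi_{\rm tr}\tanh\beta h_1\right]. \tag{35}$$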
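D. Order parameters in the second stage

The cavity field $\tilde h_1$ in the second stage is a Gaussian variable. Its mean and variance are
$pj_0\tilde m^{p-1}$ and $pJ^2\tilde q^{p-1}/2$ respectively, where $\tilde m$ and $\tilde q$ are the magnetization and
Edwards-Anderson order parameter respectively, given by

$$\tilde m = \frac{1}{N}\sum_i\left[\langle\tilde\sigma_i\rangle\,\Theta\left(\theta^2 - \langle\sigma_i\rangle^2\right) + {\rm sgn}\langle\sigma_i\rangle\,\Theta\left(\langle\sigma_i\rangle^2 - \theta^2\right)\right], \tag{36}$$

$$\tilde q = \frac{1}{N}\sum_i\left[\langle\tilde\sigma_i\rangle^2\,\Theta\left(\theta^2 - \langle\sigma_i\rangle^2\right) + \Theta\left(\langle\sigma_i\rangle^2 - \theta^2\right)\right]. \tag{37}$$

Furthermore, the covariance between $h_1$ and $\tilde h_1$ is $pJ^2r^{p-1}/2$, where $r$ is given in Eq. (32).
Algebraic manipulations can be simplified if we write, for $i = 1$,

$$\beta h_i = \hat m + \sqrt{\hat q}\,u_i, \tag{38}$$

$$\tilde\beta\tilde h_i = \hat{\tilde m} + \sqrt{\hat{\tilde q}}\left(\eta u_i + \sqrt{1-\eta^2}\,v_i\right), \tag{39}$$

where $u_i$ and $v_i$ are independent Gaussian variables with mean 0 and variance 1, $\hat m$, $\hat q$ are
given in Eq. (19), and

$$\hat{\tilde m} = p\tilde\beta j_0\tilde m^{p-1} \quad {\rm and} \quad \hat{\tilde q} = \frac{p}{2}\tilde\beta^2J^2\tilde q^{p-1}, \tag{40}$$

$$\eta = \frac{r^{p-1}}{\sqrt{q^{p-1}\tilde q^{p-1}}}. \tag{41}$$

Applying self-consistently the same cavity argument to all terms in Eqs. (36), (37), (32)
and (34) and performing the Gaussian average over $u_i$ and $v_i$, we arrive at the following
self-consistent equations for $\tilde m$, $\tilde q$, $r$ and $\chi_{\rm tr}$: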



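$$\tilde m = -\frac{1}{2}{\rm erf}\frac{u_+}{\sqrt 2} - \frac{1}{2}{\rm erf}\frac{u_-}{\sqrt 2} + \int_{u_-}^{u_+}Du\int Dv\,\tanh L, \tag{42}$$

$$\tilde q = 1 - \frac{1}{2}{\rm erf}\frac{u_+}{\sqrt 2} + \frac{1}{2}{\rm erf}\frac{u_-}{\sqrt 2} + \int_{u_-}^{u_+}Du\int Dv\,\tanh^2 L, \tag{43}$$

$$r = \left(\int_{-\infty}^{u_-} + \int_{u_+}^{\infty}\right)Du\,|\tanh G| + \int_{u_-}^{u_+}Du\int Dv\,\tanh G\tanh L, \tag{44}$$

$$\chi_{\rm tr} = \frac{\exp(-u_+^2/2)}{J\sqrt{\pi pq^{p-1}}}\int Dv\left(1 - \tanh L_{(+)}\right) + \frac{\exp(-u_-^2/2)}{J\sqrt{\pi pq^{p-1}}}\int Dv\left(1 + \tanh L_{(-)}\right), \tag{45}$$

where

$$L = \hat{\tilde m} + \sqrt{\hat{\tilde q}}\left(\eta u + \sqrt{1-\eta^2}\,v\right) + \frac{p}{2}(p-1)\tilde\beta J^2r^{p-2}\chi_{\rm tr}\tanh G, \tag{46}$$

$$L_{(\pm)} = \hat{\tilde m} + \sqrt{\hat{\tilde q}}\left(\eta u_\pm + \sqrt{1-\eta^2}\,v\right) \pm \frac{p}{2}(p-1)\tilde\beta J^2r^{p-2}\chi_{\rm tr}\,\theta. \tag{47}$$

Eqs. (20)-(21) and (42)-(47) for the order parameters $m$, $q$, $\tilde m$, $\tilde q$, $r$ and $\chi_{\rm tr}$ form a closed set of
equations. The performance of selective freezing is measured by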
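$$M_{\rm sf} = \frac{1}{N}\sum_i\left[\Theta\left(\theta^2 - \langle\sigma_i\rangle^2\right){\rm sgn}\langle\tilde\sigma_i\rangle + \Theta\left(\langle\sigma_i\rangle^2 - \theta^2\right){\rm sgn}\langle\sigma_i\rangle\right]. \tag{48}$$

From the above parameters, $M_{\rm sf}$ can be derived as:

$$M_{\rm sf} = -\frac{1}{2}{\rm erf}\frac{u_+}{\sqrt 2} - \frac{1}{2}{\rm erf}\frac{u_-}{\sqrt 2} + \int_{u_-}^{u_+}Du\,{\rm erf}\frac{L_u}{\sqrt{2\hat{\tilde q}(1-\eta^2)}}, \tag{49}$$

where $L_u = \hat{\tilde m} + \eta\sqrt{\hat{\tilde q}}\,u + [p(p-1)/2]\,\tilde\beta J^2r^{p-2}\chi_{\rm tr}\tanh G$.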

We have also tried to derive the above equations using the replica method. However,
in the closest results that we could obtain, terms involving the trans-susceptibility are absent,
which we believe to be unphysical. Therefore the replica approach to the order parameter
equations remains an open question.

We show an example of the case $p = 2$ and $j_0 = J = 1$ in Fig. 1, where the overlap
$M_{\rm sf}$ is plotted as a function of the decoding temperature $T (= \beta^{-1} = \tilde\beta^{-1})$ for various given
values of freezing fraction $f$. When $f = 0$ (no spins frozen) and $f = 1$ (all spins frozen), the
dynamics is equivalent to one with a single stage, and the overlap reaches its maximum at the
Nishimori point $T = J^2/2j_0$ as expected. We observe that the tolerance against variations
in $T$ is enhanced by selective freezing for certain values of $f$.

It is therefore interesting to consider the appropriate values of $f$ for the best overlap at a
given decoding temperature. Figs. 2(a-f) show that at high temperatures such as in Figs.
2(a-c), there is a single maximum and its position is fairly independent of temperature, lying
around $f = 0.9$ in the present case. At intermediate temperatures such as in Figs. 2(d-e),
there appear two maxima and as temperature changes, there is a discontinuous jump in the
maximum position. Fig. 2(f) shows that when the temperature is lower than the Nishimori
point ($T_N = 0.5$), the overlap cannot be improved by selective freezing.

Figure |] compares the overlap of the one-stage dynamics with that of the best of selective 
freezing. It shows that when the decoding temperature is mis-determined to be higher 
than its optimal value at the Nishimori point, selective freezing can provide a fairly robust 
performance. Furthermore, the choice of the freezing fraction for such robust performance 
appears to be quite independent of the temperature. The solid line in Fig. [| locates the 
position for the best overlap and, as observed from Figs. 2(a-f), lies in the vicinity of $f \approx 0.9$
for a large range of temperature. The unshaded region in the same figure also indicates that 
selective freezing leads to an improvement in the overlap over a wide range of the parameter 
space. 

We have also studied the dependence of the overlap on varying the freezing threshold $\theta$
rather than the freezing fraction $f$. However, Fig. [| shows that the optimal value of $\theta$ has
a much larger dependence on the temperature. This is due to the sensitive dependence of 
the thermal averages of the spins on temperature. At high temperatures, most spins are 
thermally agitated, and the freezing threshold has to be set to a very low value in order to 
freeze a given fraction of spins. On the other hand, at low temperatures, most spins are 
relatively stable, and the freezing threshold has to be set to a very high value in order to 
keep a given fraction of spins dynamic in the second stage. We conclude that the freezing 
fraction is a better controlling parameter for the decoding performance. 
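The relation between the freezing threshold and the freezing fraction in Eq. (23) can be inverted numerically, which makes the temperature sensitivity of the threshold easy to see. The sketch below assumes that the scaled mean and variance of the first-stage field are already known; the function names and the two sample pairs of values are hypothetical and chosen only to contrast a weak-field (hot) and a strong-field (cold) case.

```python
import math

def freezing_fraction(theta, m_hat, q_hat):
    """Eq. (23): fraction of spins with <sigma>^2 > theta^2 for a field G ~ N(m_hat, q_hat)."""
    if theta >= 1.0:
        return 0.0
    u0 = math.atanh(theta)
    u_plus = (u0 - m_hat) / math.sqrt(q_hat)
    u_minus = (-u0 - m_hat) / math.sqrt(q_hat)
    return (1.0 - 0.5 * math.erf(u_plus / math.sqrt(2.0))
            + 0.5 * math.erf(u_minus / math.sqrt(2.0)))

def threshold_for_fraction(f, m_hat, q_hat, tol=1e-12):
    """Invert Eq. (23) by bisection; the fraction decreases as theta grows."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if freezing_fraction(mid, m_hat, q_hat) > f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta_hot = threshold_for_fraction(0.9, m_hat=1.0, q_hat=1.0)    # weak fields
theta_cold = threshold_for_fraction(0.9, m_hat=4.0, q_hat=4.0)   # strong fields
```

For the same target fraction, the weak-field case needs a much lower threshold than the strong-field case, in line with the argument that the fraction is the more convenient control parameter.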

The advantages of selective freezing are confirmed by Monte Carlo simulations shown in
Fig. p. For one-stage dynamics, the overlap is maximum at the Nishimori point ($T_N = 0.5$) as
expected. However, it deteriorates rather rapidly when the decoding temperature increases.
In contrast, selective freezing maintains a more steady performance, especially when $f = 0.9$.
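A two-stage Monte Carlo run of this kind can be sketched as follows for the gauge-transformed $p = 2$ model, in which the correct message is the all-plus state. This is a minimal illustration and not the simulation protocol of the paper: single-spin Metropolis updates, no separate equilibration period, the system size, the number of sweeps and all function names are assumptions.

```python
import numpy as np

def time_averages(J, beta, spins, dynamic, sweeps, rng):
    """Metropolis sampling of H = -sum_{i<j} J_ij s_i s_j over the dynamic spins."""
    total = np.zeros(len(spins))
    sites = np.flatnonzero(dynamic)
    for _ in range(sweeps):
        for i in rng.permutation(sites):
            delta_e = 2.0 * spins[i] * (J[i] @ spins)   # energy change if spin i flips
            if delta_e <= 0 or rng.random() < np.exp(-beta * delta_e):
                spins[i] = -spins[i]
        total += spins
    return total / sweeps

def two_stage_decode(J, beta, f, sweeps, rng):
    """Freeze the fraction f of most stable spins after stage one, then rerun."""
    n = J.shape[0]
    first = time_averages(J, beta, np.ones(n), np.ones(n, dtype=bool), sweeps, rng)
    first_sign = np.where(first >= 0, 1.0, -1.0)
    frozen = np.zeros(n, dtype=bool)
    frozen[np.argsort(-np.abs(first))[: int(round(f * n))]] = True
    second = time_averages(J, beta, first_sign.copy(), ~frozen, sweeps, rng)
    decoded = np.where(frozen, first_sign, np.where(second >= 0, 1.0, -1.0))
    return decoded, frozen, first_sign

rng = np.random.default_rng(1)
n, j0, J_width = 60, 1.0, 1.0
upper = np.triu(rng.normal(2.0 * j0 / n, J_width / np.sqrt(n), size=(n, n)), 1)
J = upper + upper.T
decoded, frozen, first_sign = two_stage_decode(J, beta=1.0, f=0.9, sweeps=200, rng=rng)
overlap = decoded.mean()     # the gauge-transformed message is all +1
```

Here the spins are selected by rank, so that a fixed fraction is frozen directly instead of going through a threshold.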



IV. THE MEAN-FIELD MODEL FOR IMAGE RESTORATION 

In conventional image restoration problems, a given degraded image consists of the set of
pixels $\{\tau_i\}$, but not the set of exchange interactions $\{J_{i_1\cdots i_p}\}$. On the other hand, effective
restoration requires the introduction of a model prior distribution of the pixels for smooth
images. In this case the Hamiltonian corresponds to that of a random field Ising model,

$$H\{\sigma\} = -h\sum_i\tau_i\sigma_i - \frac{\beta_m}{z}\sum_{\langle ij\rangle}\sigma_i\sigma_j. \tag{50}$$

In mean-field systems, each pixel $i$ has an extensive valency. The pixels $\tau_i$ are the degraded
versions of the source pixels $\xi_i$ corrupted by noise which, for convenience, is assumed to be
Gaussian with mean $a\xi_i$ and variance $\tau^2$, i.e.

$$P(\tau_i|\xi_i) = \frac{1}{\sqrt{2\pi\tau^2}}\exp\left[-\frac{(\tau_i - a\xi_i)^2}{2\tau^2}\right]. \tag{51}$$

In turn, the source pixels satisfy the prior distribution in Eq. (1). Applying the cavity
argument for mean-field systems, the prior distribution becomes factorizable,

$$P_s(\{\xi\}) = \prod_i\frac{\exp(\beta_sm_0\xi_i)}{2\cosh\beta_sm_0}, \tag{52}$$

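where $m_0 = \tanh\beta_sm_0$. The order parameter in the first stage is given by

$$m = \frac{1}{N}\sum_i\langle\sigma_i\rangle = \frac{1}{2\cosh\beta_sm_0}\sum_{\xi=\pm 1}\exp(\beta_sm_0\xi)\int Dx\,\tanh U, \tag{53}$$

where $U = \beta_mm + ha\xi + h\tau x$. The overlap for the one-stage restoration process is given by

$$M = \frac{1}{N}\sum_i\xi_i\,{\rm sgn}\langle\sigma_i\rangle = \frac{1}{2\cosh\beta_sm_0}\sum_{\xi=\pm 1}\exp(\beta_sm_0\xi)\,\xi\,{\rm erf}\frac{\beta_mm + ha\xi}{\sqrt 2\,h\tau}. \tag{54}$$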
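The one-stage mean-field equations for image restoration can be evaluated numerically in the same spirit. The sketch below solves $m_0 = \tanh\beta_sm_0$ and the first-stage magnetization by plain fixed-point iteration and then computes the one-stage overlap. The parameter values, the iteration counts and the function names are illustrative assumptions.

```python
import math
import numpy as np

# Nodes and weights for the Gaussian measure Dx = dx exp(-x^2/2) / sqrt(2*pi)
NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(80)
WEIGHTS = WEIGHTS / math.sqrt(2.0 * math.pi)

def source_magnetization(beta_s, iters=5000):
    """Solve m0 = tanh(beta_s * m0) starting from the ordered side."""
    m0 = 1.0
    for _ in range(iters):
        m0 = math.tanh(beta_s * m0)
    return m0

def one_stage(beta_m, h, beta_s, a, tau, iters=2000):
    """Return (m0, m, M): source magnetization, first-stage magnetization and overlap."""
    m0 = source_magnetization(beta_s)
    norm = 2.0 * math.cosh(beta_s * m0)
    prior = {xi: math.exp(beta_s * m0 * xi) / norm for xi in (1, -1)}
    m = m0
    for _ in range(iters):
        m = sum(p_xi * float(WEIGHTS @ np.tanh(beta_m * m + h * a * xi + h * tau * NODES))
                for xi, p_xi in prior.items())
    M = sum(p_xi * xi * math.erf((beta_m * m + h * a * xi) / (math.sqrt(2.0) * h * tau))
            for xi, p_xi in prior.items())
    return m0, m, M

# Matched model and source: beta_m = beta_s and h = a / tau^2
m0, m, M = one_stage(beta_m=1.05, h=1.0, beta_s=1.05, a=1.0, tau=1.0)
```

When the model parameters match the source parameters, the computed magnetization should coincide with $m_0$, which serves as a consistency check.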



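Next we consider selective freezing in the second stage with a freezing threshold $\theta$. The
freezing fraction is given by

$$f = \frac{1}{2\cosh\beta_sm_0}\sum_{\xi=\pm 1}\exp(\beta_sm_0\xi)\left[1 - \frac{1}{2}{\rm erf}\frac{u_+(\xi)}{\sqrt 2} + \frac{1}{2}{\rm erf}\frac{u_-(\xi)}{\sqrt 2}\right], \tag{55}$$

where $u_\pm(\xi) = (\pm u_0 - \beta_mm - ha\xi)/h\tau$ with $\tanh u_0 = \theta$. The order parameter of the second
stage is given by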
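$$\tilde m = \frac{1}{N}\sum_i\left\{\Theta\left(\theta^2 - \langle\sigma_i\rangle^2\right)\langle\tilde\sigma_i\rangle + \Theta\left(\langle\sigma_i\rangle^2 - \theta^2\right){\rm sgn}\langle\sigma_i\rangle\right\}$$
$$= \frac{1}{2\cosh\beta_sm_0}\sum_{\xi=\pm 1}\exp(\beta_sm_0\xi)\left[-\frac{1}{2}{\rm erf}\frac{u_+(\xi)}{\sqrt 2} - \frac{1}{2}{\rm erf}\frac{u_-(\xi)}{\sqrt 2} + \int_{u_-(\xi)}^{u_+(\xi)}Dx\,\tanh\tilde L\right], \tag{56}$$

where $\tilde L = \tilde\beta_m\tilde m + \tilde ha\xi + \tilde h\tau x$. The overlap for selective freezing is given by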




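$$M_{\rm sf} = \frac{1}{2\cosh\beta_sm_0}\sum_{\xi=\pm 1}\exp(\beta_sm_0\xi)\,\xi\,{\rm erf}\frac{g(\tilde\beta_m\tilde m) + ha\xi}{\sqrt 2\,h\tau}, \tag{57}$$

where

$$g(\tilde\beta_m\tilde m) = \begin{cases}\beta_mm - u_0, & \tilde\beta_m\tilde m < \beta_mm - u_0,\\ \tilde\beta_m\tilde m, & \beta_mm - u_0 < \tilde\beta_m\tilde m < \beta_mm + u_0,\\ \beta_mm + u_0, & \tilde\beta_m\tilde m > \beta_mm + u_0.\end{cases} \tag{58}$$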



We note that since the spin-glass interaction is absent in this case, there are no
trans-susceptibility effects. This is unlike the case of error-correcting codes, in which $\chi_{\rm tr}$ is nonzero
when $J$ is nonzero.

The three cases of the function $g(\tilde\beta_m\tilde m)$ in Eq. (58) correspond to three situations. When
$\tilde\beta_m\tilde m < \beta_mm - u_0$, all the dynamic spins in the second stage have negative thermodynamic
averages and therefore take the value $-1$ in the two-stage restoration process. This is
equivalent to a one-stage restoration process in which all spins with thermodynamic averages
above the threshold $+\theta$ are frozen to $+1$, and to $-1$ otherwise. Similarly, when $\tilde\beta_m\tilde m >
\beta_mm + u_0$, all the dynamic spins in the second stage have positive thermodynamic averages.
Only when $\beta_mm - u_0 < \tilde\beta_m\tilde m < \beta_mm + u_0$ do we have the dynamic spins frozen to partly
$+1$ and partly $-1$.
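The three situations can be checked numerically: the pixel-by-pixel two-stage decision should agree with a single sign decision that uses the clipped field. The sketch below assumes the same field strength $h$ in both stages and treats the products $\beta_mm$ and $\tilde\beta_m\tilde m$ as given numbers (`bm` and `bm_tilde`); all names and values are hypothetical.

```python
import numpy as np

def g_effective(bm_tilde, bm, u0):
    """Clip the second-stage mean field to the window [bm - u0, bm + u0]."""
    return float(np.clip(bm_tilde, bm - u0, bm + u0))

def two_stage_bits(tau, h, bm, bm_tilde, theta):
    """Direct two-stage decision for each degraded pixel tau_i."""
    first = np.tanh(bm + h * tau)                 # first-stage thermal averages
    frozen = first**2 > theta**2
    return np.where(frozen, np.sign(first), np.sign(bm_tilde + h * tau))

rng = np.random.default_rng(0)
tau = rng.normal(size=10000)                      # illustrative degraded pixels
h, bm, theta = 1.0, 0.4, 0.6
u0 = np.arctanh(theta)

agreement = []
for bm_tilde in (-1.0, 0.5, 2.0):                 # below, inside and above the window
    direct = two_stage_bits(tau, h, bm, bm_tilde, theta)
    one_shot = np.sign(h * tau + g_effective(bm_tilde, bm, u0))
    agreement.append(bool(np.array_equal(direct, one_shot)))
```

Agreement in all three regimes reflects the fact that the second stage only shifts the decision boundary within a window set by the freezing threshold.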

We can consider the condition for the optimal performance $M_{\rm sf}$ of selective freezing. For
a given distribution of data and noise, $g(\tilde\beta_m\tilde m)$ is the only adjustable parameter in Eq. (57),
playing the same role as the adjustable parameter $\beta_mm$ for one-stage dynamics in Eq. (54).
In the space of $h$ and $\beta_m$, the performance is optimal along the line $h/\beta_\tau = \beta_mm/\beta_sm_0$ for
one-stage dynamics || ($\beta_\tau = a/\tau^2$ for Gaussian noise). Analogously, there exists a line of
optimal performance defined by $h/\beta_\tau = g(\tilde\beta_m\tilde m)/\beta_sm_0$ for selective freezing.

An example of the lines of optimal performance is shown in Fig. 7. It is interesting
to note the kinks for certain freezing fractions. They correspond to transitions of cases in 
which the dynamic spins are partially or completely frozen to ±1. 

A comparison of Eqs. (53) and (57) shows that selective freezing performs as well as
one-stage dynamics, but cannot outperform it. Nevertheless, selective freezing provides a
rather stable performance when the hyperparameters cannot be estimated precisely. In
image restoration, the usual practice is to choose a fixed ratio of $\beta_m/h$. Fig. 8 confirms
this stability along the line of operation with $\beta_m/h$ set to the optimal ratio $\beta_s/\beta_\tau$. Note
especially that the lines with $f = 0.7$ and $0.9$ attain a nearly optimal value of $M_{\rm sf}$ over a
wide range of parameters. The kink at $f = 0.9$ is, again, due to the appearance of the $-1$
frozen dynamic spins (to the right of the kink).
The stable performance of selective freezing can be partly explained by the proximity of
the lines of optimal performance with the line of operation which, as discussed in [[|, is an
important factor in hyperparameter estimation. This is illustrated by the optimal lines for
small values of $f$ near the Nishimori point $(T_m, h) = (1.05^{-1}, 1)$ in Fig. 7.

However, the advantage of selective freezing does not only rely on the fortuitous combi- 
nation of parameters. Even when the parameters are not chosen optimally, selective freezing 
still maintains a rather robust performance. For example, along the line of optimal
performance for $f = 0.9$ in Fig. 7, the bending at the kink only causes a modest reduction in the
overlap $M_{\rm sf}$ in Fig. 8.

To study the robustness of the performance of selective freezing, we model a situation
common in modern communication channels carrying multimedia traffic, which are often
bursty in nature. Since burstiness results in intermittent interference, we consider a noise
with two Gaussian components, each with its own characteristics. A random fraction $f_1$ of
the pixels is influenced by Gaussian noise with signal strength $a_1$ and noise variance
$\tau_1^2$. The rest of the pixels have signal strength $a_2$ and noise variance $\tau_2^2$. Hence
the distribution of the degraded pixels is

\begin{equation}
P(\tau_i|\xi_i) = f_1\,\frac{\exp\left[-\frac{(\tau_i - a_1\xi_i)^2}{2\tau_1^2}\right]}{\sqrt{2\pi\tau_1^2}}
+ f_2\,\frac{\exp\left[-\frac{(\tau_i - a_2\xi_i)^2}{2\tau_2^2}\right]}{\sqrt{2\pi\tau_2^2}}, \tag{59}
\end{equation}

where $f_2 = 1 - f_1$. The equations for the order parameters can be generalized from the
single-component case in a straightforward manner.
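
As a rough illustration of the two-component channel in Eq. (59), the following Python sketch evaluates the mixture likelihood and draws degraded pixels from it. The function and argument names are illustrative assumptions; the numerical values in the usage lines are those quoted in the caption of Fig. 9.

```python
import math
import random

def two_component_likelihood(tau_i, xi, f1, a1, tau1, a2, tau2):
    """P(tau_i | xi) for a channel mixing two Gaussian components, as in Eq. (59)."""
    f2 = 1.0 - f1

    def gauss(x, mean, sd):
        return math.exp(-(x - mean) ** 2 / (2.0 * sd ** 2)) / math.sqrt(2.0 * math.pi * sd ** 2)

    return f1 * gauss(tau_i, a1 * xi, tau1) + f2 * gauss(tau_i, a2 * xi, tau2)

def sample_degraded_pixel(xi, f1, a1, tau1, a2, tau2, rng=random):
    """Draw one degraded pixel: choose a component with probability f1 or 1 - f1,
    then add Gaussian noise of the corresponding width to the scaled source pixel."""
    if rng.random() < f1:
        return rng.gauss(a1 * xi, tau1)
    return rng.gauss(a2 * xi, tau2)

# Usage with f1 = 0.8, a1 = 1, a2 = 0.2, tau1 = tau2 = 1 and source pixel xi = +1.
rng = random.Random(0)  # fixed seed, chosen only for reproducibility
samples = [sample_degraded_pixel(+1, 0.8, 1.0, 1.0, 0.2, 1.0, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)  # expected to lie near f1*a1 + f2*a2 = 0.84
```

The sampler makes the burstiness explicit: each pixel independently falls into the minority component with probability $f_2$, which is the only feature of bursty traffic that Eq. (59) retains.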

A case of interest is that the restoration agent operates on the assumption of the
characteristics of the majority component of the channel, say the first component. Hence it
operates at the ratio $\beta_m/h = \beta_s\tau_1^2/a_1$. Suppose the Gaussian noise is partly
interrupted to take the characteristics of the second component, but the operation parameters
cannot be adjusted soon enough; then there will be a degradation of the quality of the restored
images. In the example in Fig. 9, the reduction of the overlap $M_{\rm sf}$ for selective
freezing is much more modest than that for the one-stage process ($f = 0$).

An alternative situation is that the restoration agent is able to detect the changes in
the average signal strengths and noise variance, but still operates on the assumption of a
single-component Gaussian channel. Suppose that such simple statistics as
$\langle{\rm sgn}\,\tau_i\rangle$, $\langle\tau_i\rangle$ and $\langle\tau_i^2\rangle$ are accessible.
Then the parameters $m_0^*$, $a^*$ and $\tau^*$ estimated by the restoration agent are obtained,
for $\tau_1 = \tau_2 = \tau$, from the solutions of

\begin{equation}
m_0^*\,{\rm erf}\frac{a^*}{\sqrt{2}\tau^*} = \langle{\rm sgn}\,\tau_i\rangle
= m_0\left[f_1\,{\rm erf}\frac{a_1}{\sqrt{2}\tau} + f_2\,{\rm erf}\frac{a_2}{\sqrt{2}\tau}\right], \tag{60}
\end{equation}
\begin{equation}
m_0^* a^* = \langle\tau_i\rangle = m_0\left[f_1 a_1 + f_2 a_2\right], \tag{61}
\end{equation}
\begin{equation}
a^{*2} + \tau^{*2} = \langle\tau_i^2\rangle = f_1(a_1^2 + \tau_1^2) + f_2(a_2^2 + \tau_2^2), \tag{62}
\end{equation}

and $\beta_s^* = \tanh^{-1}(m_0^*)/m_0^*$. Using these estimated parameters, the performances in
Fig. 10 improve over their counterparts based on only the majority component in Fig. 9. Still,
one-stage restoration cannot avoid the performance drop when $h$ vanishes, whereas,
correspondingly, selective freezing has a much gentler drop in performance.
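
The estimation step in Eqs. (60)-(62) can be sketched numerically as follows. This is only one possible way to solve the three equations, assuming positive statistics: Eqs. (61) and (62) are used to eliminate $a^*$ and $\tau^*$, and the remaining one-dimensional equation (60) is solved for $m_0^*$ by bisection. The function names and the value $m_0 = 0.37$ in the usage lines are hypothetical; the channel parameters are those of the caption of Fig. 9.

```python
import math

def mixture_statistics(m0, f1, a1, a2, tau):
    """Right-hand sides of Eqs. (60)-(62) for tau_1 = tau_2 = tau."""
    f2 = 1.0 - f1
    root2 = math.sqrt(2.0)
    s_sign = m0 * (f1 * math.erf(a1 / (root2 * tau)) + f2 * math.erf(a2 / (root2 * tau)))
    s_mean = m0 * (f1 * a1 + f2 * a2)
    s_sq = f1 * (a1 ** 2 + tau ** 2) + f2 * (a2 ** 2 + tau ** 2)
    return s_sign, s_mean, s_sq

def estimate_single_component(s_sign, s_mean, s_sq, tol=1e-12):
    """Solve Eqs. (60)-(62) for (m0*, a*, tau*) and return beta_s* as well."""
    def residual(m):
        a = s_mean / m                      # Eq. (61)
        tau = math.sqrt(s_sq - a * a)       # Eq. (62)
        return m * math.erf(a / (math.sqrt(2.0) * tau)) - s_sign  # Eq. (60)

    lo = (s_mean / math.sqrt(s_sq)) * (1.0 + 1e-9)  # smallest m0* with tau*^2 > 0
    hi = 1.0
    if residual(lo) * residual(hi) > 0.0:
        raise ValueError("no sign change on the bracket; bisection does not apply")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    m = 0.5 * (lo + hi)
    a = s_mean / m
    tau = math.sqrt(s_sq - a * a)
    beta_s = math.atanh(m) / m
    return m, a, tau, beta_s

# Usage: two-component channel with f1 = 0.8, a1 = 1, a2 = 0.2, tau = 1.
stats = mixture_statistics(0.37, 0.8, 1.0, 0.2, 1.0)
m_star, a_star, tau_star, beta_star = estimate_single_component(*stats)
```

For a genuinely single-component channel ($f_1 = 1$) the procedure returns the true parameters, which provides a simple consistency check of the sketch.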

It is interesting to study the more realistic case of two-dimensional images, since we have
so far presented analytical results for the mean-field model only. As confirmed by the results
of Monte Carlo simulations in Fig. 11, the overlaps of selective freezing are much steadier
than those of the one-stage dynamics when the decoding temperature changes. This
steadiness is most remarkable for a freezing fraction of $f = 0.9$.

V. DISCUSSIONS 

We have introduced a multistage technique for error-correcting codes and image
restoration, in which the information extracted from the former stage can be used selectively
to improve the performance of the latter one. While the overlap $M_{\rm sf}$ of selective
freezing is bounded by the optimal performance of the one-stage dynamics derived in [?],
selective freezing has the advantage of being tolerant to uncertainties in hyperparameter
estimation. The performance is especially steady when the fraction of frozen spins, rather
than the threshold of their thermodynamic averages, is fixed in the process. This is confirmed
by both analytical and simulational results for mean-field and finite-dimensional models. As
an example, we have illustrated its robustness when the noise distribution is composed of more
than one Gaussian component, such as in the case of modern communication channels supporting
multimedia applications.
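
The distinction between fixing the threshold and fixing the fraction of frozen spins can be sketched as follows. This is a minimal illustration, assuming that the spins selected for freezing are those with the largest magnitudes of their first-stage thermal averages and that each is frozen to the sign of its average; the function names and sample values are hypothetical.

```python
def freeze_by_threshold(averages, theta):
    """Freeze spins whose first-stage thermal average exceeds theta in magnitude.

    Returns a dict mapping spin index -> frozen value (+1 or -1)."""
    return {i: (1 if m > 0 else -1) for i, m in enumerate(averages) if abs(m) > theta}

def freeze_by_fraction(averages, fraction):
    """Freeze a fixed fraction of spins, taking those with the largest |average|."""
    n_frozen = int(round(fraction * len(averages)))
    ranked = sorted(range(len(averages)), key=lambda i: abs(averages[i]), reverse=True)
    return {i: (1 if averages[i] > 0 else -1) for i in ranked[:n_frozen]}

# Usage with made-up first-stage averages of five spins.
first_stage = [0.95, -0.2, -0.99, 0.5, 0.1]
frozen_fixed_fraction = freeze_by_fraction(first_stage, 0.4)     # spins 0 and 2
frozen_fixed_threshold = freeze_by_threshold(first_stage, 0.4)   # spins 0, 2 and 3
```

With a fixed fraction, the number of frozen spins does not depend on how strongly the first stage is ordered, whereas with a fixed threshold it varies with the decoding temperature.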

We found that selective freezing is most useful when more than one hyperparameter
has to be estimated, as illustrated by the example of image restoration, where both $\beta_m$ and
$h$ have to be estimated. In the example of error-correcting codes discussed in Section III,
there is only one hyperparameter, $T_m$, and it is found that selective freezing has performance
advantages only when $T_m$ is chosen above the Nishimori point. However, more than one
hyperparameter is often present in practical applications.

Selective freezing can be generalized to more than two stages, in which spins that remain 
relatively stable in one stage are progressively frozen in the following one. It is expected 
that the performance can be even more robust. 

While the multistage process described here has a robust performance, it does not raise
the critical temperature or the critical noise level for the existence of the ordered phase.
Nor can it widen the basin of attraction for the ordered phase. Other multistage processes,
proposed in [?] for neural networks, may be able to achieve this. This remains an area for
further research.

We have made progress in the theoretical treatment of multistage processes using the 
cavity method. It allows the thermal averages of spins to be expressed in terms of the cavity 
fields. Since a cavity field is uncorrelated with the spin in consideration, it can in turn be 
expressed in terms of the means and covariances of the spin averages, thereby arriving at 
a set of self-consistent equations for the order parameters. In particular, there appears a 
trans-susceptibility term since variations of the cavity field in the first stage are correlated 
with the spin average in the second stage due to the selective nature of the freezing process 
in the second stage. However, for the ordered phase considered in this paper, the effect of
the trans-susceptibility term is not too large except near the phase boundary.

On the other hand, we have a remark about the basic assumption of the cavity method,
namely that the addition or removal of a spin causes a small change in the system describable
by a perturbative approach. In fact, adding or removing a spin may cause the thermal
averages of other spins to change from below to above the thresholds $\pm\theta$ (or vice versa).
This change, though often small, induces a non-negligible change of the thermal averages
from fractional values to the frozen values of $\pm 1$ (or vice versa) in the second stage. The
perturbative analysis of these changes is only approximate. The situation is reminiscent of
similar instabilities in other disordered systems such as the perceptron, and is equivalent
to Almeida-Thouless instabilities in the replica method [13]. A full treatment of the problem
would require the introduction of a rough energy landscape [13], or the replica symmetry
breaking ansatz in the replica method [11]. Nevertheless, previous experience with disordered
systems shows that the corrections made by a more complete treatment may not be too
large in the ordered phase. For example, the corresponding analytical and simulational results
in Figs. 1 and 6, respectively, are close to each other.

In practical implementations of error-correcting codes, algorithms based on
belief-propagation methods, rather than Monte Carlo methods, are often employed [14]. It has
recently been shown that such decoded messages converge to the solutions of the TAP
equations in the corresponding thermodynamic system [15]. Again, the performance of these
algorithms is sensitive to the estimation of hyperparameters. We propose that the selective
freezing procedure has the potential to make these algorithms more robust.

Incidentally, multistage dynamics has also been applied in the recently popular turbo
codes [16]. Messages are coded in sequences with two possible permutations and, at each
iterative stage, the information derived from decoding one sequence is fed to the other in
the form of external fields for each bit. The techniques developed in the present context can
be used to study this iterative process.

APPENDIX A: THERMAL AVERAGES OF SPINS

In this appendix we derive Eq. (16) starting from the clustering property, Eq. (15). For
convenience we illustrate the derivation for $p = 2$. We separate the Hamiltonian into two
parts, one that does not contain $\sigma_1$ and one that does. Hence

\begin{equation}
H = H^{\backslash 1} - \beta\sum_j J_{1j}\sigma_1\sigma_j. \tag{A1}
\end{equation}

Thus the thermal average can be written as

\begin{equation}
\langle\sigma_1\rangle =
\frac{{\rm Tr}_{\backslash 1}\,e^{-H^{\backslash 1}}\,{\rm Tr}_1\,\sigma_1\exp\left(\beta\sigma_1\sum_j J_{1j}\sigma_j\right)\Big/{\rm Tr}_{\backslash 1}\,e^{-H^{\backslash 1}}}
{{\rm Tr}_{\backslash 1}\,e^{-H^{\backslash 1}}\,{\rm Tr}_1\exp\left(\beta\sigma_1\sum_j J_{1j}\sigma_j\right)\Big/{\rm Tr}_{\backslash 1}\,e^{-H^{\backslash 1}}}. \tag{A2}
\end{equation}

Expanding the exponential function in the denominator and tracing over $\sigma_1$, we get

\begin{equation}
{\rm Den.} = 2\sum_{n\ {\rm even}}\frac{\beta^n}{n!}\sum_{j_1\cdots j_n}
J_{1j_1}\cdots J_{1j_n}\langle\sigma_{j_1}\cdots\sigma_{j_n}\rangle^{\backslash 1}. \tag{A3}
\end{equation}

Next, we use the clustering property to factorize the thermal average
$\langle\sigma_{j_1}\cdots\sigma_{j_n}\rangle^{\backslash 1}$. For the coupling distribution specified
by Eq. (?), only two kinds of contributions are significant in the summation over the indices
$j_1\cdots j_n$. In the first kind, an index $j$ remains distinct from the rest, contributing a
factor of $J_{1j}\langle\sigma_j\rangle^{\backslash 1}$. In the second kind, two indices become paired
up. However, when $j$ and $k$ pair up, the thermal average
$\langle\sigma_j\sigma_k\rangle^{\backslash 1}$ becomes 1 instead of
$(\langle\sigma_j\rangle^{\backslash 1})^2$. Hence the additional contribution due to the pairing is
$J_{1j}^2[1 - (\langle\sigma_j\rangle^{\backslash 1})^2]$. Other than these, the contributions due to
the pairing of three or more indices are smaller by factors of $N$.

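The denominator can now be considered a summation over $n$ and $m$, which are respectively
the total number of indices and the number of pairs of paired indices appearing in a
term. The number of such terms is $n!/[m!\,2^m (n - 2m)!]$. Hence

\begin{equation}
{\rm Den.} = 2\sum_{n\ {\rm even}}\frac{\beta^n}{n!}\sum_{m=0}^{n/2}\frac{n!}{m!\,2^m (n - 2m)!}
\left(\sum_j J_{1j}\langle\sigma_j\rangle^{\backslash 1}\right)^{n-2m}
\left(\sum_j J_{1j}^2\left[1 - (\langle\sigma_j\rangle^{\backslash 1})^2\right]\right)^m, \tag{A4}
\end{equation}

which can be simplified to

\begin{equation}
{\rm Den.} = 2\exp\left\{\frac{\beta^2}{2}\sum_j J_{1j}^2\left[1 - (\langle\sigma_j\rangle^{\backslash 1})^2\right]\right\}
\cosh\left(\beta\sum_j J_{1j}\langle\sigma_j\rangle^{\backslash 1}\right). \tag{A5}
\end{equation}

Similarly, the numerator can be written as

\begin{equation}
{\rm Num.} = 2\exp\left\{\frac{\beta^2}{2}\sum_j J_{1j}^2\left[1 - (\langle\sigma_j\rangle^{\backslash 1})^2\right]\right\}
\sinh\left(\beta\sum_j J_{1j}\langle\sigma_j\rangle^{\backslash 1}\right). \tag{A6}
\end{equation}

Substituting Eqs. (A5) and (A6) into Eq. (A2), we arrive at Eq. (16).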
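
The resummation that takes the double sum of Eq. (A4) to the closed form of Eq. (A5) can be checked numerically. In the sketch below, `A` and `B` are shorthand (introduced here, not in the original) for $\sum_j J_{1j}\langle\sigma_j\rangle^{\backslash 1}$ and $\sum_j J_{1j}^2[1 - (\langle\sigma_j\rangle^{\backslash 1})^2]$ respectively, and the truncation order `n_max` is an arbitrary choice.

```python
import math

def den_series(beta, A, B, n_max=60):
    """Truncated double sum over n (even) and m, in the form of Eq. (A4).

    The factor n! from the counting of pairings cancels the 1/n! of the expansion."""
    total = 0.0
    for n in range(0, n_max + 1, 2):
        for m in range(0, n // 2 + 1):
            weight = math.factorial(m) * 2 ** m * math.factorial(n - 2 * m)
            total += beta ** n * A ** (n - 2 * m) * B ** m / weight
    return 2.0 * total

def den_closed(beta, A, B):
    """Closed form in the style of Eq. (A5): 2 exp(beta^2 B / 2) cosh(beta A)."""
    return 2.0 * math.exp(0.5 * beta ** 2 * B) * math.cosh(beta * A)

# Usage with arbitrary test values.
series_value = den_series(0.8, 0.5, 0.3)
closed_value = den_closed(0.8, 0.5, 0.3)
```

The agreement follows from substituting $k = n - 2m$, which separates the double sum into the series for $\cosh(\beta A)$ and for $\exp(\beta^2 B/2)$.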



APPENDIX B: CHANGE IN THERMAL AVERAGES ON REMOVAL OF A SPIN 

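In this appendix we derive Eq. (29). For convenience we illustrate the derivation for
$p = 2$. We separate the Hamiltonian into four parts: (a) does not contain spins 1 and $j$,
(b) contains only spins 1 and $j$, (c) contains spin 1 but not $j$, (d) contains spin $j$ but not 1.
This yields

\begin{equation}
H = H^{\backslash 1j} - \beta J_{1j}\sigma_1\sigma_j - \beta\sum_{k\neq 1,j} J_{k1}\sigma_k\sigma_1
- \beta\sum_{k\neq 1,j} J_{kj}\sigma_k\sigma_j. \tag{B1}
\end{equation}

The thermal average of $\sigma_j$ can then be written as

\begin{equation}
\langle\sigma_j\rangle =
\frac{{\rm Tr}_{1j}\,\sigma_j\,{\rm Tr}_{\backslash 1j}\,e^{-H}\Big/{\rm Tr}_{\backslash 1j}\,e^{-H^{\backslash 1j}}}
{{\rm Tr}_{1j}\,{\rm Tr}_{\backslash 1j}\,e^{-H}\Big/{\rm Tr}_{\backslash 1j}\,e^{-H^{\backslash 1j}}}. \tag{B2}
\end{equation}

Using the mean-field technique developed in Appendix A, the denominator can be written
as

\begin{equation}
{\rm Den.} = {\rm Tr}_{1j}\exp\left\{\beta J_{1j}\sigma_1\sigma_j
+ \beta\sum_{k\neq 1,j}\langle\sigma_k\rangle^{\backslash 1j}(J_{k1}\sigma_1 + J_{kj}\sigma_j)
+ \frac{\beta^2}{2}\sum_{k\neq 1,j}\left[1 - (\langle\sigma_k\rangle^{\backslash 1j})^2\right](J_{k1}\sigma_1 + J_{kj}\sigma_j)^2\right\}. \tag{B3}
\end{equation}

After collecting terms and discarding negligible ones,

\begin{equation}
{\rm Den.} = {\rm Tr}_{1j}\exp\left\{\beta\sigma_1\sum_{k\neq 1,j} J_{1k}\langle\sigma_k\rangle^{\backslash 1j}
+ \beta\sigma_j\sum_{k\neq 1,j} J_{jk}\langle\sigma_k\rangle^{\backslash 1j}
+ \beta J_{1j}\sigma_1\sigma_j + \beta^2(1 - q)J^2\right\}. \tag{B4}
\end{equation}

Together with a similar manipulation of the numerator, we obtain

\begin{equation}
\langle\sigma_j\rangle = \tanh\beta\left(h_j^{\backslash 1} + J_{j1}\tanh\beta h_1^{\backslash j}\right), \tag{B5}
\end{equation}

whose Taylor expansion yields

\begin{equation}
\langle\sigma_j\rangle = \langle\sigma_j\rangle^{\backslash 1}
+ \left(\beta\,{\rm sech}^2\beta h_j^{\backslash 1}\right) J_{j1}\tanh\beta h_1^{\backslash j}, \tag{B6}
\end{equation}

which becomes Eq. (29) for the case $p = 2$.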
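
The accuracy of the first-order expansion leading from Eq. (B5) to Eq. (B6) can be illustrated numerically. The sketch below assumes that $\langle\sigma_j\rangle^{\backslash 1} = \tanh\beta h_j^{\backslash 1}$ and that the coupling $J_{j1}$ is small; the function names and the numerical values are illustrative only.

```python
import math

def exact_average(beta, h_j, J_j1, h_1):
    """Thermal average of spin j in the form of Eq. (B5)."""
    return math.tanh(beta * (h_j + J_j1 * math.tanh(beta * h_1)))

def first_order_average(beta, h_j, J_j1, h_1):
    """First-order Taylor expansion in J_j1, in the form of Eq. (B6)."""
    sech2 = 1.0 / math.cosh(beta * h_j) ** 2
    return math.tanh(beta * h_j) + beta * sech2 * J_j1 * math.tanh(beta * h_1)

# Usage: the discrepancy shrinks quadratically as the coupling decreases.
error_large = abs(exact_average(1.0, 0.3, 0.01, 0.7) - first_order_average(1.0, 0.3, 0.01, 0.7))
error_small = abs(exact_average(1.0, 0.3, 0.001, 0.7) - first_order_average(1.0, 0.3, 0.001, 0.7))
```

Since the neglected term is of second order in $J_{j1}$, reducing the coupling by a factor of 10 reduces the discrepancy by roughly a factor of 100.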

FIGURES




FIG. 1. The overlap $M_{\rm sf}$ as a function of the decoding temperature $T$ for $p = 2$ and
$j_0 = J = 1$ for various given values of the freezing fraction $f$. In this and the following
figures, $f = 0$ corresponds to one-stage decoding/restoration.

FIG. 2. The overlap $M_{\rm sf}$ as a function of the freezing fraction $f$ at temperatures $T =$
(a) 1.5, (b) 1.2, (c) 1.0, (d) 0.8, (e) 0.6, (f) 0.4, for $p = 2$ and $j_0 = J = 1$.

FIG. 3. The temperature dependence of the best overlap of selective freezing compared with
the overlap of the one-stage dynamics for $p = 2$ and $j_0 = J = 1$.

FIG. 4. The freezing fraction $f$ for the best overlap as a function of temperature $T$ for $p = 2$
and $j_0 = J = 1$. In this and the following figure, solid line: global maximum, dashed line: local
maximum, shaded region: $M_{\rm sf} < M$.

FIG. 6. Results of Monte Carlo simulations for the overlaps of selective freezing compared
with that of the one-stage dynamics for $p = 2$ and $j_0 = J = 1$, corresponding to Fig. 1. The
simulation parameters are: $N = 1000$ with an initial overlap of 0.9 and 200 samples. Each stage
consists of 500 Monte Carlo steps per node for equilibration and 1000 Monte Carlo steps per node
for averaging.

FIG. 7. The lines of optimal performance in the space of the random-field strength $h$ and the
restoration temperature $T_m = \beta_m^{-1}$ in the mean-field model of image restoration for
$a = \tau = 1$ and $\beta_s = 1.05$. The dotted line is the line of operation with $\beta_m/h$ set to
the optimal ratio $\beta_s/\beta_\tau = 1.05$. At $f = 0.9$ the dynamic spins are completely frozen
to $+1$ to the left of the kink, but only partially to the right.

FIG. 8. The performance of selective freezing at $a = \tau = 1$ and $\beta_s = 1.05$, with
$\beta_m/h$ set to the optimal ratio $\beta_s/\beta_\tau = 1.05$ for various freezing fractions $f$.

FIG. 9. The performance of selective freezing with 2 components of Gaussian noise at
$\beta_s = 1.05$, $f_1 = 4f_2 = 0.8$, $a_1 = 5a_2 = 1$ and $\tau_1 = \tau_2 = 1$. The restoration agent
operates assuming the majority component, i.e. $\beta_m/h = \beta_s\tau_1^2/a_1$.

FIG. 10. Same as Fig. 9 except that the restoration agent operates with the ratio
$\beta_m/h = \beta_s^*\tau^{*2}/a^*$, where $\beta_s^*$, $\tau^*$ and $a^*$ are estimated from
Eqs. (60)-(62).

FIG. 11. Results of Monte Carlo simulations for the overlaps of selective freezing compared
with that of the one-stage dynamics for two-dimensional images generated at the source prior
temperature $T_s = 2.15$. The simulation parameters are: $N = 50 \times 50$, with an initial
overlap of 0.8 and 1000 samples. Each stage consists of 1000 Monte Carlo steps per node for
averaging.